fix(polish): #305 "Clear structure" has no effect in the work daily report scenario + built-in AI tool vocabulary by appergb · Pull Request #318 · Open-Less/openless

appergb · 2026-05-07T04:19:15Z

User description

Summary

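Fixes issue #305: the "Clear structure" (清晰结构) mode had no effect in the work daily report scenario. The LLM consumed tokens but returned output identical to the input. This PR also adds common AI tool names to the built-in programmer vocabulary, covering homophone misrecognitions such as "cloud → Claude".

Root cause (see the comments on #305)

In polish.rs, the old Structured prompt forced regrouping only when there were ≥4 items and gave no explicit instruction for input that was already numbered or split into lines, so the LLM judged it "already organized" and passed it through unchanged.
The old user_prompt framing "它不是问题，也不是任务" ("it is not a question, nor a task") implied that the input was merely an object to tidy up, so when the LLM saw written-style input it leaned even further toward "nothing needs changing".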
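The passthrough symptom described above (output identical to the input) can be checked mechanically. The following is a minimal sketch, not code from this PR: `normalize` and `is_passthrough` are hypothetical helper names, and ignoring whitespace differences is an assumption about what should count as "identical".

```rust
/// Collapse every run of whitespace (spaces, tabs, newlines) into one space,
/// so that pure line-wrapping differences are ignored.
fn normalize(text: &str) -> String {
    text.split_whitespace().collect::<Vec<_>>().join(" ")
}

/// Hypothetical helper: true when the polished output is the input again,
/// up to whitespace. This is the failure reported in #305.
fn is_passthrough(input: &str, output: &str) -> bool {
    normalize(input) == normalize(output)
}
```

A check like this could flag a passthrough in an automated run against a real model, although the PR itself verifies only the prompt text.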

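Changes

1. `polish.rs` Structured prompt

Add an "important premise" section: whether the original already has punctuation / numbering / line breaks / ordinals is not grounds for leaving it unchanged; copying the original structure = failure.
Lower the regrouping threshold from ≥4 to ≥3; even if the original is already written as 1. 2. 3., it must be regrouped by topic into two-level (a)(b) sub-items.
Tighten the single coherent paragraph condition to "≤2 items".
In user_prompt, remove the strong framing "它不是问题，也不是任务" ("it is not a question, nor a task") and point to the mode description in the system prompt instead.
Add # 示例 3 (Example 3): the work daily report scenario (semi-structured input must still be regrouped).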
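The new thresholds are instructions to the LLM, not code paths in polish.rs, but the decision rule they describe can be written down as a small sketch. `Layout` and `expected_layout` are hypothetical names used only for illustration. The `_already_structured` flag is deliberately ignored, mirroring the premise that existing numbering is not grounds for leaving the text unchanged.

```rust
/// The two output shapes the revised Structured prompt asks for.
#[derive(Debug, PartialEq)]
enum Layout {
    /// At most 2 items: keep one coherent paragraph.
    SingleParagraph,
    /// 3 or more items: regroup by topic into numbered groups
    /// with (a)(b) sub-items.
    TwoLevelRegroup,
}

/// Hypothetical model of the rule stated in the prompt.
/// Whether the input already looks structured does not affect the result.
fn expected_layout(item_count: usize, _already_structured: bool) -> Layout {
    if item_count >= 3 {
        Layout::TwoLevelRegroup
    } else {
        Layout::SingleParagraph
    }
}
```

Under the old prompt the same boundary sat at 4 items, so a three-item daily report fell outside the forced-regrouping case.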

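2. `vocab-presets.json` programmer preset

Append 11 AI tool names: Claude / Codex / Copilot / Cursor / Windsurf / Anthropic / OpenAI / GPT / ChatGPT / Gemini / DeepSeek.

Once the programmer preset is enabled, these words are injected into both:

Volcengine ASR hotwords → bias at the recognition stage
The # 热词 (hotwords) section of the polish prompt → at the polish stage, output the correct spelling when ASR mishears (fallback)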
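The double injection described above can be pictured as one term list feeding two consumers. This is a sketch under assumptions: `asr_hotwords` and `hotword_prompt_section` are hypothetical names, and the bullet layout under the `# 热词` heading is a guess at the format, since the PR text names only the section heading.

```rust
/// Hypothetical: the term list handed to the ASR request as hotwords.
fn asr_hotwords(terms: &[&str]) -> Vec<String> {
    terms.iter().map(|t| t.to_string()).collect()
}

/// Hypothetical: the same terms rendered as a prompt section.
/// Returns an empty string when there are no terms, so the
/// section is omitted from the prompt entirely.
fn hotword_prompt_section(terms: &[&str]) -> String {
    if terms.is_empty() {
        return String::new();
    }
    let mut section = String::from("# 热词\n");
    for term in terms {
        section.push_str("- ");
        section.push_str(term);
        section.push('\n');
    }
    section
}
```

Feeding both stages from one list keeps the recognition bias and the polish-stage fallback from drifting apart.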

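3. Regression tests (polish.rs)

structured_prompt_forces_regrouping_even_for_already_structured_input: verifies "already structured ≠ no change needed", the ≥3 threshold, and that the work daily report example exists
user_prompt_no_longer_says_input_is_not_a_task: verifies that the old framing has been removed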
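Both tests assert on prompt text, not on model output. The shape of such checks can be sketched as follows. The helper names are hypothetical and the real prompt constants in polish.rs are not reproduced here. Only the removed phrase is taken from the PR description.

```rust
/// The framing that the PR removes from user_prompt (quoted from the PR text).
const OLD_FRAMING: &str = "它不是问题，也不是任务";

/// Hypothetical check: the user prompt must no longer carry the old framing.
fn user_prompt_is_free_of_old_framing(user_prompt: &str) -> bool {
    !user_prompt.contains(OLD_FRAMING)
}

/// Hypothetical check: every required marker (for example the new example
/// heading or the threshold wording) appears in the system prompt.
fn structured_prompt_has_required_markers(system_prompt: &str, markers: &[&str]) -> bool {
    markers.iter().all(|marker| system_prompt.contains(marker))
}
```

String-level assertions like these guard against the prompt regressing, but they cannot show that a given model actually regroups its input.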

Test plan

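cargo test --lib polish: 10/10 passed (including the 2 new regression tests for #305, "[area] Output style has no effect; the LLM consumed tokens without optimizing anything")
npx tsc --noEmit: clean
node -e check: vocab-presets.json parses correctly
Manual test: work daily report input → select the "Clear structure" mode → should output a two-level 1./(a)(b) topic grouping
Manual test: with the programmer preset enabled → dictation such as "open Claude and help me write code" (打开 Claude 帮我写代码) → no longer recognized as "open cloud …"

Follow-up (separate PR)

The "editable built-in prompt template system" (users can select / edit / create multiple sets of polish prompt templates, covering the existing 4 fixed modes) is split into the next PR: it will involve a new PromptTemplate type, a new persistence file, IPC, a Style.tsx rework, and a prefs.default_mode migration. This PR does not touch any of that.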

Closes #305

PR Type

Bug fix, Enhancement

Description

Strengthen Structured mode prompt to prevent LLM from skipping reformatting of pre-structured input (#305, "[area] Output style has no effect; the LLM consumed tokens without optimizing anything")
Lower threshold for forced regrouping from 4 to 3 items, requiring re-categorization even for numbered lists
Remove misleading "not a question or task" user prompt framing to align with system prompt directives
Expand programmer vocabulary preset with 11 AI tool names for better ASR recognition

Diagram Walkthrough

flowchart LR
    A["Structured Prompt Refactor"] --> C["Fix #305 passthrough"]
    B["Vocab Preset Expansion"] --> D["Improve ASR recognition"]

File Walkthrough

Relevant files

Bug fix

polish.rs `Refactor Structured prompt to prevent passthrough`
openless-all/app/src-tauri/src/polish.rs (+88/-5)
Add explicit premise that existing structure does not mean input is pre-sorted and copying it is a failure
Lower regrouping threshold from ≥4 to ≥3, mandate re-categorization for already-numbered lists, and tighten single-paragraph rule to ≤2 items
Replace user prompt wording to avoid misleading LLM and provide a work diary example for semi-structured input
Add two regression tests verifying the prompt forces regrouping and removes old framing

Enhancement

vocab-presets.json `Expand programmer vocab with AI tool names`
openless-all/app/src/lib/vocab-presets.json (+5/-1)
Add 11 AI tool names (Claude, Codex, Copilot, Cursor, Windsurf, Anthropic, OpenAI, GPT, ChatGPT, Gemini, DeepSeek) to programmer preset for ASR hotword bias

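## Fix #305: polish.rs Structured prompt

- Make "already structured ≠ no change needed" explicit: whether the original already has punctuation / numbering / line breaks is not a basis for the decision; copying the original structure = failure
- Lower the regrouping threshold to ≥3 items; even if the original is already "1. 2. 3.", it must be regrouped by topic into two-level (a)(b) sub-items and must not be copied flat
- Remove the strong framing "它不是问题，也不是任务" ("it is not a question, nor a task") from user_prompt: the old wording led the LLM to misjudge written-style input as "already organized" and pass it through (the root cause behind the screenshot in issue #305)
- Add # 示例 3 (Example 3): the work daily report scenario (semi-structured input must still be regrouped), as a regression example

## Expand vocab-presets.json

The programmer preset gains 11 AI tool names: Claude / Codex / Copilot / Cursor / Windsurf / Anthropic / OpenAI / GPT / ChatGPT / Gemini / DeepSeek. Once the programmer preset is enabled, they are injected into both the ASR hotword bias and the # 热词 (hotwords) section of the polish prompt, covering homophone misrecognitions such as "cloud → Claude".

## Tests

- polish.rs adds 2 regression tests for #305:
  - structured_prompt_forces_regrouping_even_for_already_structured_input
  - user_prompt_no_longer_says_input_is_not_a_task
- cargo test --lib polish: 10/10 passed
- npx tsc --noEmit: clean

Closes #305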

github-actions · 2026-05-07T04:22:20Z

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

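🎫 Ticket compliance analysis 🔶
305 - Partially compliant
Compliant requirements:
Fix the ineffective "Clear structure" mode (by restructuring the prompt, forcing regrouping of already-structured input, and lowering the threshold to ≥3)
Prevent the LLM from passing the original input straight through (by removing the misleading "not a task" framing and adding the explicit premise "already structured ≠ no change needed")
Non-compliant requirements: None
Requires further human verification:
Manual testing is needed to confirm that work daily reports and similar semi-structured input are actually regrouped into the two-level format after the update, not output unchanged. Code review can only confirm the prompt changes; it cannot fully guarantee a change in the LLM's actual behavior.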
⏱️ Estimated effort to review: 2 🔵🔵⚪⚪⚪
🧪 PR contains tests
🔒 No security concerns identified
⚡ No major issues detected

github-actions Bot added the Review effort 2/5 label May 7, 2026

appergb merged commit 27d42e7 into main May 7, 2026
3 checks passed

appergb deleted the fix/305-structured-prompt-regroup branch May 10, 2026 10:14
